Summary. - In this contribution we briefly report on progress and open problems
in parton distribution functions (PDFs), with emphasis on their implications
for LHC phenomenology. We then study the impact of the recent ATLAS and CMS
W lepton asymmetry data on the NNPDF2.1 parton distributions. We show that
these data provide the first constraints on PDFs from LHC measurements.

1. Progress and open problems in parton distributions

The quantitative control of the Standard Model contribution to collider signal and
background processes at the few percent level is a necessary ingredient not only for
precision physics, but also for discovery at the LHC. The precision determination of
parton distribution functions (PDFs) is essential in order to achieve this level of
theoretical accuracy.

There has been substantial progress in PDF analysis in recent years, and it is thus
impossible to review it in detail in this contribution. A recent concise report on the
status of the field can be found in Ref. [1], while more detailed reviews can be found in
Refs. [2-5]. In this contribution we restrict ourselves to highlighting some important
topics in PDF determinations. First, we sketch the current status of PDF fits and
discuss some of the open problems in the field. Then we discuss how the ATLAS
and CMS measurements of the W lepton asymmetry provide the first constraints
on PDFs from the LHC, thus paving the way for PDFs based on LHC data.

PDF analyses have entered the era in which they can be considered a quantitative
science. An ideal PDF determination should satisfy several important requirements [2].
It should be based on a dataset that is as wide as possible, in order to ensure
that all relevant experimental information is retained; it should use a sufficiently
general and unbiased parton parametrization; and it should provide statistically
consistent confidence levels for PDF uncertainties. Moreover, such an ideal set should
include heavy quark mass effects through a GM-VFN scheme [6] and be based on
computations performed at the highest available perturbative order. Finally, PDF sets
should be provided for a variety of values of $\alpha_s$, reasonably finely spaced, and
similarly for the heavy quark masses, and should include an estimate of the
uncertainties related to the truncation of the perturbative expansion. While there has
been sizable progress on each of these aspects in recent years, no PDF set yet fulfills
all these conditions.

Table I. - Summary of the features of the most recent PDF sets from each group. The CT10,
MSTW08 and NNPDF2.1 sets include data from a wide variety of physical processes and are
thus called global PDF sets. See text for more details.

Set          Ref.   Dataset          Parametrization   PDF uncertainties
ABKM09       [14]   DIS+DY           Polynomial        Hessian, standard tol.
CT10         [15]   DIS+DY+W/Z+jet   Polynomial        Hessian, dyn. tol.
HERAPDF1.0   [16]   DIS              Polynomial        Hessian, standard tol.
JR08         [17]   DIS+DY+jet       Polynomial        Hessian, fixed tol.
MSTW08       [18]   DIS+DY+W/Z+jet   Polynomial        Hessian, dyn. tol.
NNPDF2.1     [12]   DIS+DY+W/Z+jet   Neural Nets       Monte Carlo

Set          PT order      Heavy Quarks   Strong coupling
ABKM09       NLO/NNLO      FFNS           Fitted
CT10         NLO           S-ACOT-χ       Fixed + range of values
HERAPDF1.0   NLO           TR             Fixed
JR08         NLO/NNLO      FFNS           Fitted
MSTW08       LO/NLO/NNLO   TR             Fitted + range of values
NNPDF2.1     NLO           FONLL-A        Fixed + range of values

One important development in PDFs in recent years has been the NNPDF approach [7-11].
Thanks to a combination of Monte Carlo techniques and artificial neural networks, the
NNPDF approach avoids some of the drawbacks of the standard approach, such as the bias
due to the arbitrary choice of input functional forms or the use of linear
approximations for PDF uncertainty estimation. The most recent NNPDF set
is NNPDF2.1 [12], an unbiased NLO global fit of all relevant hard scattering data based
on the FONLL-A GM-VFN scheme [13].

Several groups provide regular updates of their PDF sets: in alphabetical order these
are ABKM, CT, HERAPDF, JR, MSTW and NNPDF. In Table I we summarize some
of the features of the most recent PDF set from each collaboration. We consider only
those sets available in the LHAPDF library. We compare the dataset, the parametrization,
the method used to estimate PDF uncertainties, the perturbative orders at which PDFs are
available, the theoretical scheme adopted to include heavy quark mass effects and the
treatment of the strong coupling $\alpha_s$. More details on each of these issues can be
found in Refs. [2,4,5], as well as in the original publications of each group.

The main difference arises from the datasets used in each of the various analyses.
The CT10, MSTW08 and NNPDF2.1 sets include data from a wide variety of physical
processes and are thus called global PDF sets. Other PDF sets use more restrictive
subsets, like ABKM09, which excludes Tevatron jet and weak vector boson production data,
and HERAPDF1.0, which is based solely on HERA data.

Fig. 1. - Comparison of the NLO total cross sections for $W^+$ and $t\bar{t}$ production
and their combined PDF+$\alpha_s$ uncertainties at the LHC 7 TeV between the most recent
PDF sets of each group. Plots from G. Watt.

PDFs are typically parametrized with relatively simple functional forms like
$q(x, Q_0^2) \sim x^a (1-x)^b P(x; c, d, \ldots)$, with $P$ a polynomial that
interpolates between the small- and large-x regions. These unjustified theoretical
assumptions introduce a potentially large functional form bias in PDF determinations.
The NNPDF approach bypasses this problem by using neural networks as universal unbiased
interpolants. Related techniques for general PDF parametrizations, such as Chebyshev
polynomials, have also been discussed in the literature [19,20].
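
As an illustration, the simple functional form quoted above can be evaluated numerically
as follows. This is only a minimal sketch and not the parametrization of any specific
group: the function name `pdf_param`, the overall normalization `norm` and the convention
P(x) = 1 + c1 x + c2 x^2 + ... are assumptions made here for concreteness.

```python
import numpy as np

def pdf_param(x, norm, a, b, poly_coeffs):
    """Evaluate norm * x**a * (1 - x)**b * P(x) at the input scale.

    P(x) = 1 + c1*x + c2*x**2 + ... with poly_coeffs = [c1, c2, ...]
    (hypothetical convention chosen for this sketch).
    """
    x = np.asarray(x, dtype=float)
    # np.polyval expects the highest power first, so reverse the
    # coefficients and append the constant term 1.
    poly = np.polyval([*poly_coeffs[::-1], 1.0], x)
    return norm * x**a * (1.0 - x)**b * poly

# Example: a toy valence-like shape on a small grid of x values
x_grid = np.array([1e-3, 1e-2, 0.1, 0.5, 0.9])
toy = pdf_param(x_grid, norm=2.0, a=0.5, b=3.0, poly_coeffs=[1.5])
```

The small-x behaviour is driven by `a`, the large-x behaviour by `b`, and everything in
between depends on the chosen polynomial, which is where the functional form bias
discussed in the text can enter.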

PDF uncertainties are estimated by all groups except NNPDF using the Hessian
method. However, different choices are made for the tolerance $T = \sqrt{\Delta\chi^2}$
adopted to define 1-sigma PDF uncertainties. For example, while HERAPDF1.0 and ABKM09
are based on the textbook tolerance $\Delta\chi^2 = 1$, MSTW08 and CT10 adopt a dynamical
tolerance criterion that results in tolerances $\Delta\chi^2 > 1$, which moreover differ
for each eigenvector direction. It has been suggested that the need for large tolerances
arises in part from the use of restrictive input functional forms [20]. NNPDF, on the
other hand, is based on the Monte Carlo approach, that is, a sampling in the space of
experimental data, which allows an exact propagation of uncertainties from data to PDFs
and from these to physical observables.
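
The practical difference between the two approaches can be sketched as follows. The code
assumes the commonly used symmetric Hessian formula, in which an observable is evaluated
on each pair of eigenvector sets, and the standard Monte Carlo prescription of taking the
mean and standard deviation over replicas. The function names are illustrative and the
input numbers are invented for the example.

```python
import numpy as np

def hessian_uncertainty(obs_plus, obs_minus):
    """Symmetric Hessian uncertainty on an observable.

    obs_plus[k], obs_minus[k]: observable computed with the +/- member
    of the k-th eigenvector pair. The tolerance is assumed to be already
    built into the eigenvector sets.
    """
    plus = np.asarray(obs_plus, dtype=float)
    minus = np.asarray(obs_minus, dtype=float)
    return 0.5 * np.sqrt(np.sum((plus - minus) ** 2))

def monte_carlo_estimate(obs_replicas):
    """Central value and 1-sigma uncertainty from Monte Carlo replicas."""
    reps = np.asarray(obs_replicas, dtype=float)
    return reps.mean(), reps.std(ddof=1)

# Toy numbers, for illustration only
err_hessian = hessian_uncertainty([1.1, 1.0], [0.9, 1.0])
central_mc, err_mc = monte_carlo_estimate([1.0, 2.0, 3.0])
```

In the Hessian case the size of the uncertainty depends on the tolerance chosen when the
eigenvector sets were built, whereas in the Monte Carlo case it follows directly from the
spread of the replicas.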

Recently, a detailed benchmarking of the predictions for relevant LHC observables
from modern NLO PDF sets was performed in the context of the PDF4LHC working
group [5]. In Fig. 1 we compare the NLO predictions from different PDF sets for two
important LHC observables, the total $W^+$ and $t\bar{t}$ cross sections. One of the
conclusions of that study is that the agreement between global PDF sets is reasonable
for most LHC processes, and much better than for sets based on restrictive datasets.
However, it was also clear that even among global sets there are important discrepancies
whose origin still needs to be understood, related for example to the large-x gluon and
to strangeness. Another recent benchmark study, this time at NNLO, was presented in
Ref. [21].

The PDF4LHC exercise helped to elucidate the differences and similarities between PDF
sets. In particular, it showed that the most important source of difference between sets
is the choice of fitted data. This study was the basis of the PDF4LHC recommendation [22],
which suggests taking the envelope of the combined PDF+$\alpha_s$ uncertainties from
the three global PDF sets, CT10, MSTW08 and NNPDF2.1, to estimate the PDF+$\alpha_s$
uncertainty on LHC processes. The PDF4LHC recommendation has been adopted by ATLAS and
CMS in those analyses sensitive to PDFs, and in particular the LHC Higgs cross section
working group [1] uses the PDF4LHC recipe to estimate the combined PDF+$\alpha_s$
uncertainty in its theoretical predictions, see Fig. 2. The same recipe has been used to
derive the most recent Tevatron Higgs exclusion limits [23].

J. ROJO

Fig. 2. - Left plot: comparison of the NLO Higgs production cross section with the combined
PDF+$\alpha_s$ uncertainties from NNPDF2.0, MSTW08 and CTEQ6.6, and the resulting PDF4LHC
recipe [22] envelope, from Ref. [1]. Right plot: comparison of the MSTW08 and the
preliminary NNPDF2.1 NNLO predictions for the NNLO Higgs production cross section. For the
MSTW08 prediction two values of $\alpha_s$ have been used.
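
The envelope prescription can be sketched in a few lines. The code below assumes that
each PDF set provides a central value and a symmetric combined PDF+$\alpha_s$ uncertainty
for the observable, and that the midpoint of the envelope is taken as the central value.
The function name and the input numbers are invented for illustration and are not actual
cross sections.

```python
def envelope(predictions):
    """Envelope of several predictions.

    predictions: iterable of (central, uncertainty) pairs, one per PDF set.
    Returns (midpoint, half_width) of the band spanned by all
    central +/- uncertainty intervals.
    """
    upper = max(c + u for c, u in predictions)
    lower = min(c - u for c, u in predictions)
    return 0.5 * (upper + lower), 0.5 * (upper - lower)

# Toy inputs standing in for three global sets (illustrative numbers only)
toy_predictions = [(10.0, 0.5), (10.4, 0.4), (9.8, 0.6)]
mid, half_width = envelope(toy_predictions)
```

By construction the envelope is at least as wide as the largest individual uncertainty,
and it grows when the central values of the sets disagree.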

Let us now turn to some open problems in PDF fits: the treatment of $\alpha_s$,
Higgs production at hadron colliders and deviations from DGLAP in HERA data. The
treatment of the strong coupling in PDF fits is a source of differences between sets, as
summarized in Table I. Some groups, like MSTW or ABKM, determine $\alpha_s$
simultaneously with the PDFs, while others, like CT or NNPDF, fix $\alpha_s$ to a value
close to the PDG average [24], $\alpha_s(M_Z) = 0.1184 \pm 0.0007$ in the latest update.
Differences between PDF sets are reduced when a common value of $\alpha_s$ is used, as
shown also in the comparison plots of Fig. 1.

Let us emphasize that the choice of fixing $\alpha_s$ to the PDG value in the reference
PDF set is not necessarily related to the sensitivity of a given PDF analysis to
$\alpha_s$. Rather, it reflects the idea that the average of $\alpha_s$ from a wide range
of processes, including some like $\tau$ decays that are unrelated to the proton
structure, is necessarily more accurate than the determination from a single PDF fit.
For example, NNPDF [25] has recently performed an NLO determination of the strong
coupling, finding good consistency with the PDG value:
$\alpha_s(M_Z) = 0.1191 \pm 0.0006$, where the uncertainty is purely statistical.

The treatment of $\alpha_s$ is closely related to one of the most important processes at
the LHC, Higgs production in its dominant channel of gluon fusion. This process is very
sensitive to $\alpha_s$ [26], since the partonic cross section scales as
$\mathcal{O}(\alpha_s^2)$ already at leading order, and it has received a lot of
attention recently due to claims that theoretical uncertainties were being
underestimated. Preliminary NNLO results from NNPDF, shown in Fig. 2, suggest reasonable
agreement with the MSTW08 NNLO prediction, as was already the case at NLO, thus
confirming the PDF4LHC recipe estimates. It is also clear that the use of a common value
of $\alpha_s$ further improves the agreement between the two sets.
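
To get a feeling for the size of this sensitivity, one can propagate a shift in
$\alpha_s$ through the leading-order scaling alone. This is only a rough sketch: it
ignores higher-order corrections and the correlation between $\alpha_s$ and the gluon
PDF, both of which change the actual dependence, and the function name is illustrative.

```python
def relative_shift(alpha_s, delta_alpha_s, power=2):
    """Relative change of a cross section that scales as alpha_s**power
    when alpha_s is shifted by delta_alpha_s (pure power-law scaling)."""
    return ((alpha_s + delta_alpha_s) / alpha_s) ** power - 1.0

# Shift alpha_s(M_Z) by the uncertainty of the PDG average quoted in the text
shift = relative_shift(0.1184, 0.0007)
print(f"relative change of the LO partonic cross section: {100 * shift:.2f}%")
```

Even a sub-percent uncertainty on $\alpha_s$ is thus roughly doubled in the leading-order
cross section, before any PDF effect is taken into account.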
Fig. 3. - The kinematic coverage in the $(x, Q^2)$ plane for W production at the LHC in
the central (ATLAS and CMS) and forward (LHCb) regions.




Another open problem in PDF determinations is the potential departure from fixed-order
DGLAP evolution in the small-x and small-$Q^2$ HERA data. The analysis of Refs. [27,28]
found evidence for deviations from NLO DGLAP in the small-x combined HERA-I
data, consistent with small-x resummation and non-linear dynamics but not with NNLO
corrections. This effect has been confirmed by the HERAPDF analysis, which also finds
a worse fit quality at NNLO for the small-x data. A related CT10 [29] analysis found
some hints as well, but it was restricted to a few functional forms for the small-x
PDFs. If deviations from DGLAP for the low-x HERA data are confirmed, small-x
resummation [30] would be a necessary ingredient in order to exploit the full potential
of HERA data for precision LHC physics.

2. Constraining PDFs with LHC W asymmetry data 

We now turn to the first constraints on PDFs from LHC data, provided by the
ATLAS [31] and CMS [32] measurements of the leptonic W asymmetry (1). As is well
known, W production at hadron colliders is sensitive to the light quark and antiquark
PDFs at medium and small x. The kinematic coverage of W production at the LHC is
summarized in Fig. 3. We have studied the impact of the W asymmetry data using the
Bayesian reweighting method of Ref. [33]. Bayesian reweighting is a powerful technique
to efficiently determine the impact of new data on PDFs without the need for refitting.
This method also allows one to assess the internal consistency of the datasets and their
compatibility with the global fit.
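
The core of the reweighting procedure can be sketched as follows. The code assumes the
weight formula of the reweighting literature, in which replica k receives a weight
proportional to $(\chi^2_k)^{(n-1)/2} e^{-\chi^2_k/2}$ for n new data points, normalized
so that the weights sum to the number of replicas, together with a Shannon-entropy
estimate of the effective number of replicas. The function names are illustrative and
are not taken from any NNPDF code.

```python
import numpy as np

def reweight(chi2, n_data):
    """Weights of the Monte Carlo replicas given the new data.

    chi2[k]: (unnormalized, strictly positive) chi^2 of replica k
             with respect to the new data.
    n_data:  number of new data points.
    """
    chi2 = np.asarray(chi2, dtype=float)
    # Work with logarithms to avoid overflow or underflow.
    log_w = 0.5 * (n_data - 1) * np.log(chi2) - 0.5 * chi2
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w * len(chi2) / w.sum()  # normalize so that sum(w) = N_rep

def effective_replicas(weights):
    """Shannon-entropy estimate of the number of effective replicas."""
    w = np.asarray(weights, dtype=float)
    n = len(w)
    nz = w > 0  # replicas with vanishing weight do not contribute
    return np.exp(np.sum(w[nz] * np.log(n / w[nz])) / n)

def weighted_mean_std(values, weights):
    """Reweighted central value and uncertainty of an observable."""
    v = np.asarray(values, dtype=float)
    mean = np.average(v, weights=weights)
    std = np.sqrt(np.average((v - mean) ** 2, weights=weights))
    return mean, std

# Toy example with three replicas and ten new data points
w_toy = reweight([8.0, 12.0, 30.0], n_data=10)
n_eff = effective_replicas(w_toy)
```

If the effective number of replicas drops well below the original number, the new data
are either very constraining or inconsistent with the prior fit, and a larger starting
sample (or a full refit) becomes necessary.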

A detailed discussion of the impact of LHC data on NNPDF will be presented elsewhere.
In this contribution we restrict ourselves to some selected preliminary results.
We show results for the impact of the combined ATLAS and CMS data. In the
case of CMS we consider the more inclusive dataset (with a cut on the lepton transverse
momentum of $p_T > 25$ GeV) and both electrons and muons. For ATLAS only the muon
asymmetry has been presented. The theoretical predictions have been computed with
the DYNNLO generator [34] at NLO accuracy for NNPDF2.1. The kinematic cuts are
the same as in the respective experimental analyses.

(1) There exist as well preliminary data from LHCb that will be sensitive to even smaller
and larger values of x, see Fig. 3.

Fig. 4. - The ATLAS and CMS W lepton asymmetry data compared to the NNPDF2.1 predictions
before and after reweighting.

In Fig. 4 we compare the ATLAS and CMS lepton asymmetry data with the NNPDF2.1
predictions before and after including the effect of these datasets. We notice that the
data are already nicely consistent with the NNPDF2.1 prediction within the respective
uncertainties. After including the LHC measurements, one finds that the W asymmetry
data constrain the PDF uncertainties and lead to an even better agreement with the
data. A more detailed statistical analysis confirms that the ATLAS and CMS data are
consistent with each other and with the experiments included in the global PDF analysis.
After reweighting, the $\chi^2$ per data point of the combined CMS and ATLAS data is
$\lesssim 1$.

Next, in Fig. 5 we show the constraints on the PDFs provided by the combined
ATLAS and CMS W asymmetry data. We find that the PDF uncertainties are reduced
for the medium- and small-x light quarks and antiquarks, by a factor that can be as large
as ~ 30-40%. The impact on other PDFs is smaller. The central PDF prediction is almost
unaffected by the LHC data, further confirming the consistency of the W asymmetry
measurements with the global fit. At large x the constraints are weaker, as expected from
the kinematic coverage shown in Fig. 3. Upcoming measurements of this asymmetry by
LHCb might help in reducing PDF uncertainties in the large-x region.

Note that these preliminary results have been derived from a sample of only
$N_{\rm rep} = 100$ Monte Carlo replicas. This means that there can be non-negligible
fluctuations, and explains why PDF uncertainties are apparently reduced even at very
small x, outside the kinematic coverage of the ATLAS and CMS data.

To summarize, we have shown that the W lepton asymmetry is the first dataset from
the LHC that has the precision to constrain PDFs and thus improve the accuracy of
Standard Model computations for LHC processes. We have quantified this impact on
the light quark and antiquark PDFs, and found that PDF uncertainties can be reduced
by up to ~ 40% at medium and small x. More constraints on PDFs should soon
be available from upcoming LHC measurements.

Fig. 5. - The impact of the ATLAS and CMS lepton asymmetry data on the relative uncertainty
of the light quark and antiquark NNPDF2.1 PDFs.

3. Outlook

In this contribution we have briefly reviewed recent developments and open problems
related to PDFs, with emphasis on their implications for the LHC physics program.
While our understanding of the proton structure has seen great progress in recent
years, there are still open questions that need to be answered, and that are important
for improving even further the accuracy of theoretical predictions at the LHC. We have
also presented preliminary results on the impact of the LHC W lepton asymmetry data on
the NNPDF2.1 set. We have shown that these data provide the first constraints on PDFs
from LHC measurements; in particular, they help to pin down with better accuracy the
medium- and small-x light quarks and antiquarks.

In the medium term, LHC measurements will provide very important constraints on
most PDF combinations. This will allow parton distributions to be derived solely from
collider data: HERA, Tevatron and the LHC. Collider data are more robust theoretically
and experimentally than the low-energy fixed target data that currently provide basic
constraints in global PDF analyses. To carry out this program, several measurements will
be provided by the LHC: Z-boson rapidity distributions, low-mass Drell-Yan differential
distributions, high-$E_T$ jets and photons, and W/Z production in association with heavy
quarks. The increased experimental and theoretical accuracy of PDFs determined in this
way will provide solid ground for precision Standard Model predictions and searches
for new physics at the LHC.

* * * 

I would like to thank the La Thuile 2011 organizers for their kind invitation to present 
this review. I thank all the members of the NNPDF Collaboration for endless discussions 
on PDFs, especially S. Forte. I thank G. Watt for providing the plots in Fig. 1.